Generalization Ability of Online Strongly Convex Learning Algorithms

Author

  • John Wieting
Abstract

Online learning, in contrast to batch learning, proceeds in a sequence of rounds. At the beginning of each round, an example is presented to the learning algorithm, which labels it using its current hypothesis; the algorithm then receives the correct label and updates its hypothesis. This differs from batch learning, where all of the data is given at once and the goal is to construct a single optimal hypothesis from the entire data set, in the hope that it will generalize well to unseen data. In online learning, the goal is instead to minimize the total loss over the entire sequence of training examples, and a new hypothesis is generated after nearly every example. Online learning is motivated by situations in which a batch approach is infeasible or undesirable: for example, when the data set is so large that storing and learning from all of it is computationally infeasible, or when the distribution generating the data changes over time, so that more than one hypothesis generates the data during the sequence. Recently, tools from statistical learning theory have been used to analyze this paradigm. The generalization ability of online learning with convex loss functions was analyzed in [1], and [2] extends this work by investigating online algorithms with strongly convex loss functions. This analysis is motivated by the large number of optimization problems in machine learning that are strongly convex: for instance, all problems that use a log-loss or square-loss objective, as well as those that use a convex (but not necessarily strongly convex) loss function together with L2 regularization or another strongly convex regularizer. The latter case includes the SVM problem, which combines a convex loss (the hinge loss) with L2 regularization. This paper discusses [2] and examines it through the lens of our CS 598 course. It also discusses the paper's main application: bounding, with high probability, the convergence rate of the Pegasos algorithm [4].
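To make the round-by-round protocol and the Pegasos application concrete, the sketch below shows one possible Pegasos-style online update for the L2-regularized hinge loss, which is λ-strongly convex: in each round the learner predicts with its current weight vector, observes the true label, and takes a subgradient step with step size 1/(λt), followed by a projection onto a ball of radius 1/√λ. The function name, the value of λ, and the synthetic data are illustrative assumptions; this is a minimal sketch of the update rule described in [4], not the authors' code.

```python
import numpy as np

def pegasos_online(X, y, lam=0.1):
    """Minimal Pegasos-style online SGD on the L2-regularized hinge loss.

    The per-round objective (lam/2)*||w||^2 + max(0, 1 - y_t <w, x_t>) is
    lam-strongly convex, so the step size 1/(lam * t) is used, as in [4].
    """
    n, d = X.shape
    w = np.zeros(d)
    cumulative_loss = 0.0
    for t in range(1, n + 1):
        x_t, y_t = X[t - 1], y[t - 1]
        # Round t: predict with the current hypothesis, then observe the label.
        margin = y_t * np.dot(w, x_t)
        cumulative_loss += 0.5 * lam * np.dot(w, w) + max(0.0, 1.0 - margin)
        # Subgradient step on the instantaneous strongly convex loss.
        eta = 1.0 / (lam * t)
        grad = lam * w
        if margin < 1.0:
            grad = grad - y_t * x_t
        w = w - eta * grad
        # Projection onto the ball of radius 1/sqrt(lam) keeps the iterates bounded.
        norm = np.linalg.norm(w)
        if norm > 1.0 / np.sqrt(lam):
            w = w / (norm * np.sqrt(lam))
    return w, cumulative_loss / n

# Illustrative usage on synthetic linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = np.sign(X @ rng.normal(size=5))
w_hat, avg_loss = pegasos_online(X, y, lam=0.1)
print("average online loss:", avg_loss)
```

The average online loss returned here is the quantity (average regret plus the comparator's loss) that the high-probability bounds in [2] relate to the excess risk of the output hypothesis.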


Similar articles

On the Generalization Ability of Online Strongly Convex Programming Algorithms

This paper examines the generalization properties of online convex programming algorithms when the loss function is Lipschitz and strongly convex. Our main result is a sharp bound, that holds with high probability, on the excess risk of the output of an online algorithm in terms of the average regret. This allows one to use recent algorithms with logarithmic cumulative regret guarantees to achi...


Applications of strong convexity--strong smoothness duality to learning with matrices

It is known that a function is strongly convex with respect to some norm if and only if its conjugate function is strongly smooth with respect to the dual norm. This result has already been found to be a key component in deriving and analyzing several learning algorithms. Utilizing this duality, we isolate a single inequality which seamlessly implies both generalization bounds and online regret...


On the Generalization Ability of Online Learning Algorithms for Pairwise Loss Functions

In this paper, we study the generalization properties of online learning based stochastic methods for supervised learning problems where the loss function is dependent on more than one training sample (e.g., metric learning, ranking). We present a generic decoupling technique that enables us to provide Rademacher complexity-based generalization error bounds. Our bounds are in general tighter th...


On the duality of strong convexity and strong smoothness: Learning applications and matrix regularization

We show that a function is strongly convex with respect to some norm if and only if its conjugate function is strongly smooth with respect to the dual norm. This result has already been found to be a key component in deriving and analyzing several learning algorithms. Utilizing this duality, we isolate a single inequality which seamlessly implies both generalization bounds and online regret bou...


A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds

Recently, much work has been done on extending the scope of online learning and incremental stochastic optimization algorithms. In this paper we contribute to this effort in two ways: First, based on a new regret decomposition and a generalization of Bregman divergences, we provide a self-contained, modular analysis of the two workhorses of online learning: (general) adaptive versions of Mirror...



Publication date: 2013